ABSTRACT


As higher education institutions develop fully online course 
programs to provide better access for the non-traditional learner, 
there is increasing interest in identifying students who may be at 
risk of attrition and poor performance in these online course 
programs. In our study, we investigate the effectiveness of an 
online orientation course in improving student retention in an 
online college program. Using student activity data from the 
orientation course, Engage, we make use of machine learning 
methods to develop prediction models of whether students will 
be retained and continue to register for program-specific courses 
in the eVersity program. We then discuss the implications of our 
findings on improvements that may be made to the existing 
orientation course to improve student retention in the program. 


Keywords 


Prediction modeling, online orientation course, student retention 


1. INTRODUCTION 


With the widespread development of online learning programs 
in institutes of higher learning, access to a college education has 
improved by a considerable amount. Despite increased 
enrollment rates within these online degree programs, however, 
student attrition or dropout rates also tend to be correspondingly 
higher than in traditional face-to-face degree programs [4, 21]. 
Dropout can occur early for many students in online programs; 
some students drop out even before they register for their first 
course [24]. As such, it has become increasingly important for 
facilitators and administrators to identify factors that may 
influence attrition and retention in these online course offerings, 
and implement targeted interventions to increase retention. 


Some of these targeted interventions involve the use of machine 
learning to provide timely information on student progress 
within a course to teachers and facilitators [1, 12, 17]. These 
interventions allow them to identify at-risk students earlier on
within an online course, and take steps to encourage student
retention. Another type of intervention involves the 
development of online orientation courses taken before the 
beginning of the program. These courses aim to provide students 
with the support and resources they may need during their 
progression through the program [3, 8]. A combination of the 
above interventions may also be implemented where machine 
learning models are developed to identify patterns in student 
behavior within online orientation courses themselves, which 
could help inform teachers and facilitators of students at risk of 
dropout even earlier on within an online program. 


In this study, we use machine learning to investigate student 
behavior within a required online orientation course, Engage, 
for students registered in an online university, eVersity. eVersity 
is a completely online course program established and 
developed by the University of Arkansas System (UAS). Using 
student data in this online orientation course, we developed a 
model that allows us to predict the likelihood of their continued 
participation in the online college program, through their 
registration in future program-specific courses. 


2. LITERATURE REVIEW 


There has been extensive research in recent years to identify 
factors that lead to low student retention rates, particularly 
within the context of online learning programs [9, 16, 25]. 
Attrition and retention can be defined in several ways. Since this 
paper is focused on an online course program that emphasizes 
learning at students’ own pace and preferred time(s), we make 
use of the definition proposed by Pascarella and Terenzini [22] 
(p.374), where retention is defined as progressive re-enrollment, 
whether continuous from one term to the next, or temporarily 
interrupted and then resumed, until completion with a degree. 


Several researchers have found that student dropout rates in 
online courses are due to a variety of circumstances, including 
personal, job, or technology-related reasons [25], and are 
typically independent of demographic factors such as gender and 
race [2, 11, 25]. Park et al. [20] also found that organizational 
support and course relevance are better predictors than 
demographic variables, and significantly predict student
persistence as well as student dropout in online course 
programs. Both O’Brien & Renner [18] and Jung et al. [14] 
replicated these findings and found that online courses that 
increase opportunities for student interaction, such as group
work, tend to improve student engagement, thereby reducing
student dropout. 


A popular intervention that has been implemented to improve 
student retention, based on these findings, is the development of 
orientation courses that seek to provide new students with 
organizational support, guidance, and resources that they may 
need to support their online learning. Studies have found that 
such online orientation courses can be effective at improving 
retention and the overall student learning experience [5, 8, 13]. 


Other interventions have focused on providing information to 
instructors, academic advisors, and facilitators on which 
students are at risk, so that the student can be contacted and 
better supported [1, 12, 17]. Increasingly, these types of 
interventions have been driven using automated models that can 
identify students who are at risk of dropping out or performing 
poorly, so that instructors and facilitators can focus intervention 
efforts on the students who are most likely to benefit from an
intervention. The use of data mining techniques has enabled 
course facilitators to identify at-risk students early on within a 
course. For instance, Dekker and colleagues [7] made use of 
data mining techniques to identify students at risk of dropping 
out from an electrical engineering program, after the first 
semester of their studies, or even before they enter the program. 
In another study, Lauria et al. [15] developed models to predict 
student performance based on course management system data 
as well as student academic records. 


Such models have then been used by higher education
institutions to provide support through early interventions to
at-risk students. This type of intervention has been developed and
implemented by various universities and companies, including 
Purdue University, Marist College, Civitas Learning, and 
ZogoTech [1, 10, 12, 17]. Arnold & Pistilli’s work [1], for
example, examines the development and implementation of 
Course Signals at Purdue University. Course Signals makes use 
of learning analytics to help course faculty provide accurate 
real-time feedback to their students about whether they are on 
track to succeed in their current course. Analyses of student 
performance showed that students who participated in at least 
one Course Signals course achieved better grades and 
experienced higher retention rates than their peers who did not
participate in any Course Signals courses. Similarly, Fritz [10] 
makes use of learning analytics to develop an intervention called 
“Check My Activities”, where students are given the 
opportunity to compare their online course activity against an 
anonymous summary of their peers in the course, thus providing 
early system feedback directly to the students so that they are 
more aware of their own levels of engagement within a course. 


3. EVERSITY — ONLINE LEARNING 


eVersity is a fully online institution of the University of
Arkansas System, which comprises institutions of higher
education across the state. The mission of eVersity is to provide
online education specifically for adult learners; in particular,
at-risk learners who may have previously dropped out of college
and may require additional support to be successful 
academically. Currently the eVersity student population is 65% 
female, 69% white, 27% black or African American, and the 
average age is 36. Each academic term runs for a short 6 weeks 
to allow enrolled students maximum flexibility in fitting the 
online courses within their schedules. 


To better serve students, eVersity offers a free credit-bearing
orientation course, Engage. This course fulfills two functions, 
both related to the goal of improving student retention: to 
introduce students to the tools and information they need to be 
successful in an online learning environment, and for the 
institution to get to know its students. Engage also aims to 
provide resources and guidance to new students as they continue 
on to register in program-specific online courses within 
eVersity. Upon enrollment in the eVersity program during any 
of the seven terms throughout the year, students are 
automatically registered in Engage. Within Engage, information 
is organized into 6 Steps: Welcome, Getting to Know You, 
Funding My Future, Supporting My Academic Success, 
Developing My Learning Plan, and My Financial Plan. Students 
are free to explore the six course sections at their own pace 
within the six-week academic term. 


To ensure student participation within each section of the 
course, students are required to complete knowledge checks and 
assessments at the end of each Step before they can access the 
next Step. These assessments and checkpoints help students to 
process the information provided within each Step, and provide 
students with practice opportunities to complete work in online 
formats that will be commonly used within later
program-specific courses, such as uploading assignments and journal
entries, and taking online quizzes. Completion of the Engage 
course is required for students who wish to continue on to 
register for program-specific courses on eVersity. 


4. METHODS 


4.1 Orientation Course Data 

The dataset used for analysis was obtained from the Blackboard 
online learning system, and included student data from the first 
rollouts of the Engage course in the October 2015 and January 
2016 terms. As discussed above, each term spans approximately 
six weeks. The data set provided resource access information per 
student, including date accessed and page accessed, as well as 
actions performed while on these pages. Resources accessed and 
respective actions include: 


1. Journals: add journal entry, view draft, edit journal 
entry 


2. Assessments: launch assessment, review attempt, save 
attempt, submit assessment 


3. Assignments: upload assignment 
4. Discussion Boards: discussion entry, discussion reply 


5. Messages: view messages, email instructor, email 
select students 


6. Gradebook: check grade 


We also obtained demographic data consisting of each student’s 
age, gender, race, whether or not their parents attended college, 
and whether or not they registered for a class in any of the three 
academic terms immediately following the completion of the 
Engage orientation course. Of the cohort, a total of 151 students 
registered for courses after completing the Engage orientation 
course. 


We then built a prediction model to identify which student 
features are more strongly associated with future registration in 
for-credit courses on eVersity.


4.2 Data Cleaning and Feature Generation
The data set obtained from eVersity included resource access 
data, and demographic and enrollment data. It represented 
97,298 page accesses and actions across 325 students. 


During their use of Engage, these students interacted with 
course content (i.e., video lectures), journals, assessments in the 
form of online quizzes, assignments, discussion boards, 
messages, and the gradebook. Each transaction within the access 
log contained a user ID, date stamp (with no time data 
available), page accessed, and, where relevant, the action 
performed. 


The features investigated in this study included: 


1. Total counts — total number of times student accessed 
each resource regardless of what action they 
performed (e.g., total count for journal access is the 
sum of the total count of journal access to write a new 
post and the total count of journal access to edit an 
existing post) 


2. Days till first access — number of days since start of 
interaction until a student accessed any of the 
resources and performed each of their specific actions 


3. Days between — average number of days between 
specific resources accesses and actions performed 
(e.g., average number of days between two journal 
views, average number of days between creation of a 
journal post and editing or submitting the same 
journal post) 


4. Inactivity — average number of days inactive (i.e., 
number of days between any two transactions) 


5. Descriptive statistics — average, standard deviation, 
minimum, and maximum values per resource access 
across days the student interacted with Engage 
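
The first four feature families listed above could be computed along the following lines. This is a minimal sketch under stated assumptions, not the pipeline used in the study: the function name `engagement_features`, the `(student_id, day, resource)` tuple layout, and the output key names are hypothetical, and "start of interaction" is assumed here to mean the student's first logged transaction. Dates carry no time component, matching the access log described above.

```python
from collections import defaultdict
from datetime import date


def mean_gap_in_days(days):
    """Average number of days between consecutive dates, or None if fewer than two."""
    gaps = [(later - earlier).days for earlier, later in zip(days, days[1:])]
    return sum(gaps) / len(gaps) if gaps else None


def engagement_features(log):
    """Build per-student features from (student_id, day, resource) tuples.

    The tuple layout and the feature names are illustrative assumptions.
    """
    by_student = defaultdict(list)
    for student_id, day, resource in log:
        by_student[student_id].append((day, resource))

    features = {}
    for student_id, events in by_student.items():
        events.sort(key=lambda event: event[0])
        start = events[0][0]  # assumed meaning of "start of interaction"
        per_resource = defaultdict(list)
        for day, resource in events:
            per_resource[resource].append(day)

        row = {}
        for resource, days in per_resource.items():
            row[resource + "_total"] = len(days)                        # total counts
            row[resource + "_days_till_first"] = (days[0] - start).days
            row[resource + "_days_between"] = mean_gap_in_days(days)
        # Inactivity: average gap between any two consecutive transactions.
        row["inactivity"] = mean_gap_in_days([day for day, _ in events])
        features[student_id] = row
    return features
```

The per-day descriptive statistics (average, standard deviation, minimum, maximum) are left out of this sketch; they would follow the same pattern, grouping accesses by day before aggregating.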


In calculating these features, we excluded behaviors that were 
required to complete the Engage course. Completing the Engage 
course was required in order for a student to continue on to 
register for a program-specific course, so any feature required to 
complete Engage would be tautologically connected to 
registering for a program-specific course. Specifically, we 
excluded student activity around completing assessments, 
uploading assignments, and adding journal entries. We thus 
removed these features in order to identify other student actions 
that may be related to future student registration in an eVersity 
course, but are not explicitly required for the student to register 
in an eVersity course. 


4.3 Prediction Modeling 


Prediction models of student activity were created using
RapidMiner 5.3 in order to determine which combination of
features best predicts whether a student will register in a
program-specific course after completing Engage. We attempted to
predict this variable using J-Rip classification and J-48 decision
trees, with 10-fold student-level cross-validation. Cross-validation
splits the data points into N equal-size groups; in the current
study, data points were split into 10 groups. The model is trained
on all groups but one and tested on the held-out group, and this
is repeated until each group has served as the test set once.
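
The splitting procedure can be illustrated with a short sketch. The study ran cross-validation inside RapidMiner; the Python below is only an approximation of the idea, and the function name `student_level_folds` and the `seed` parameter are hypothetical. Splitting on unique student IDs keeps all of a student's data in a single fold, which is what "student-level" is taken to mean here.

```python
import random


def student_level_folds(student_ids, k=10, seed=0):
    """Yield (train_ids, test_ids) pairs, one per fold.

    Students, not individual log rows, are assigned to folds, so no student
    appears in both the training and the test set of the same fold.
    """
    unique_ids = sorted(set(student_ids))
    random.Random(seed).shuffle(unique_ids)
    # Striding by k gives k groups whose sizes differ by at most one.
    folds = [unique_ids[i::k] for i in range(k)]
    for test_ids in folds:
        held_out = set(test_ids)
        train_ids = [s for s in unique_ids if s not in held_out]
        yield train_ids, sorted(held_out)
```

With 325 students and k = 10, each test fold holds 32 or 33 students and every student is tested exactly once.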


J-48 decision trees, the RapidMiner Weka Expansion Pack
implementation of the C4.5 algorithm, can handle both
numerical and categorical predictor variables. The algorithm
repeatedly looks for the feature that best splits the data in
terms of predictive power. It later prunes out
branches that turn out to have low predictive power. Different
branches can have different sets of features. In cases where
numerical predictors are used, the algorithm tries to find the
optimal split. J-Rip is the RapidMiner Weka Expansion Pack
implementation of the Repeated Incremental Pruning to Produce
Error Reduction (RIPPER) algorithm [6], a propositional rule
learner. J-Rip produces a set of rules, through stages of growing
and pruning, that accounts for all classes and minimizes error.


Model variable selection was conducted using forward selection, 
where the feature that most increases fit is added to the current 
model, until no additional features improve the model. The 
resultant models’ performance was assessed using Cohen’s
Kappa and AUC ROC. Kappa indicates the degree to which the
detector is better than chance at identifying a modeled construct:
a Kappa of 0 means that the model is no better than chance, and
a Kappa of 1 means perfect performance. AUC ROC is the area under
the ROC curve, and is also the probability that, given one instance of
‘registered’ and one instance of ‘not registered’, the model is able
to tell which instance is which. It is computed using the A’
implementation to control for artificially high AUC ROC 
estimates due to having multiple data points with the same 
confidence. An AUC ROC value of 0.5 indicates chance level of 
performance, while a value of 1 means perfect accuracy. 
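
Both metrics can be written out in a few lines. The sketch below is illustrative only and is not the RapidMiner implementation used in the study; the function names are hypothetical. For A', it assumes that a pair of instances with identical confidence counts as half a correct comparison, which is one common way to keep tied confidences from inflating the estimate.

```python
def cohens_kappa(actual, predicted):
    """Agreement between actual and predicted labels, corrected for chance."""
    n = len(actual)
    observed = sum(a == p for a, p in zip(actual, predicted)) / n
    labels = set(actual) | set(predicted)
    # Chance agreement: product of the marginal label frequencies.
    expected = sum(
        (actual.count(label) / n) * (predicted.count(label) / n)
        for label in labels
    )
    if expected == 1.0:
        return 0.0  # degenerate case: only one label ever occurs
    return (observed - expected) / (1.0 - expected)


def a_prime(actual, confidence):
    """Probability that a registered student outranks a non-registered one.

    `confidence` is the model's confidence that the label is 1; ties count
    as half (an assumption of this sketch).
    """
    positives = [c for a, c in zip(actual, confidence) if a == 1]
    negatives = [c for a, c in zip(actual, confidence) if a == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in positives
        for q in negatives
    )
    return wins / (len(positives) * len(negatives))
```

Comparing every positive with every negative is quadratic in the number of students, which is unproblematic for a data set of a few hundred students.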


4.4 Demographic Cross-Validation 

Some prior research has shown that prediction models may have 
different levels of accuracy for different subgroups within the 
data set [19]. To determine whether this was a concern, we 
evaluated the performance of the models across different 
demographic groups in our data set. After the models had been
developed and cross-validated, we took the models’ predictions
on the test sets and evaluated their performance on subsets of
the data based on the different demographic groups in our
sample. In particular, we compared the performance of the
model by gender (male versus female), race (white versus 
African-American) and parents’ college education (parents 
attended college versus parents did not attend college). In 
addition to the majority of white and African-American students 
analyzed, 7 students were Native American. This number of 
students was insufficient to allow for a valid calculation. We 
then calculated performance metrics for each of these 
demographic groups. 
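
The subgroup evaluation described above amounts to pooling the held-out predictions and recomputing a metric within each demographic group. A minimal sketch follows; the record layout, the function names, and the `min_size` cutoff are hypothetical, and plain accuracy stands in for Kappa and AUC to keep the example short.

```python
from collections import defaultdict


def accuracy(actual, predicted):
    """Share of predictions that match the actual label."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def metric_by_group(records, group_key, metric, min_size=1):
    """Apply `metric` to the pooled test-set predictions of each group.

    `records` are dicts holding 'actual', 'predicted', and demographic fields.
    Groups with fewer than `min_size` students get None, mirroring the
    decision not to report a group that is too small for a valid calculation.
    """
    grouped = defaultdict(list)
    for record in records:
        grouped[record[group_key]].append(record)

    results = {}
    for group, rows in grouped.items():
        if len(rows) < min_size:
            results[group] = None
            continue
        results[group] = metric(
            [row["actual"] for row in rows],
            [row["predicted"] for row in rows],
        )
    return results
```

Because the predictions come from the cross-validation test folds, each student is scored by a model that never saw that student during training.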


5. RESULTS 


5.1 Model and Performance 

Prediction models created using the W-J48 and W-JRip
classification algorithms both achieved high Kappa and AUC
values, at comparable levels. As such, we discuss both of these
models below. The models run and their respective
performance values can be found in Table 1.


Table 1. Cross-validated performance of models of student 
enrollment with different classification algorithms 


Classifier   Kappa   AUC
J-48         0.806   0.925
J-Rip        0.825   0.913


Proceedings of the 10th International Conference on Educational Data Mining 252 


5.1.1 J-48 Model 

With the J-48 model, a total of four features were selected in 
some folds of the cross-validation, but not all of them were 
selected in the final model fit on all data: 


• number of days before grades were first checked by
  the student,
• minimum number of times grades were checked by the
  student,
• total number of views of online messages within the
  course platform, and
• total number of views of the Discussion Board Reply
  page.

The four features initially selected in some of the
cross-validation folds indicate that students who checked their course
grades earlier and more frequently, responded more to 
discussion board posts, and viewed in-course messages more 
frequently were more likely to register in a program-specific 
eVersity course after completing Engage. 


The final decision tree generated using this algorithm contained 
3 leaf nodes and 2 decision nodes. The decision tree generated 
by the prediction model is shown in Figure 1. 


As can be seen in the figure, only 2 of the selected features had 
strong enough associations with future course registration to be 
included in the pruned decision tree built on all data: Number of 
views of the Discussion Board Reply page, and the number of 
days till the first time the student checks their course grades. 


The decision tree generated with the J-48 model, shown in 
Figure 1, provides an indication of how each student’s future 
course registration is predicted, and the confidence level 
assessed for each student’s prediction. 


[Figure 1: decision tree]
Total discussion board replies (views)
  fewer views: Future course registration prediction: 0 (Confidence: 98.8%)
  more views: Days till first time student checks grade
    earlier: Future course registration prediction: 1 (Confidence: 86.4%)
    later or never: Future course registration prediction: 0 (Confidence: 99.1%)


Figure 1: Visual representation of the decision tree generated 
by the J-48 algorithm 


The decision tree in Figure 1 shows that a student who has made 
fewer attempts to respond in the discussion board is less likely 
to register in a program-specific course in the future, with a 
confidence of 98.8%. Similarly, we can see that students who 
checked their course grades earlier on during the term were 
more likely to register for a program-specific course afterwards, 
with a confidence of 86.4%. In contrast, students who only
viewed their course grades much later after the start of the
orientation course or not at all had a 99.1% confidence of not
registering for another eVersity course in the future.


5.1.2 J-Rip Model 

In the J-Rip model, on the other hand, only one feature was 
selected: the total number of views of the Discussion Board 
Reply page. Based on the J-Rip model classification rules, 
students who viewed the Discussion Board Reply page more 
often (>= 3 times) within the duration of the orientation course 
had a higher probability of registering in an eVersity course 
afterwards, with a confidence of 82.4%. In contrast, students 
who viewed the Discussion Board Reply page fewer than 3 times
during the course had a lower likelihood of registering in 
another course later on, with a confidence of 98.8%. 


The J-48 and J-Rip models obtained comparable performance 
metrics, with the J-48 model having a marginally higher AUC 
value than the J-Rip model, and the J-Rip model having a 
slightly higher Kappa value than its J-48 counterpart. This 
implies that the J-Rip model had a higher proportion of correct 
predictions when thresholded, but because only one 
classification rule was selected, there were only 2 confidence 
values that were associated with these predictions, hence 
resulting in a lower AUC value. In contrast, more features were 
selected in the J-48 model (and more differentiations were 
made), which could explain the slightly higher AUC value for 
that model than the J-Rip model. 
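
The point about the J-Rip model producing only two confidence values can be made concrete. The sketch below encodes the single reported rule (3 or more views of the Discussion Board Reply page) with the confidences reported above; the function names are hypothetical, and converting a class-0 confidence into a registration score as one minus that confidence is an assumption of this sketch.

```python
def jrip_rule(reply_page_views):
    """Return (predicted label, confidence) for the single reported rule."""
    if reply_page_views >= 3:
        return 1, 0.824  # predicted to register
    return 0, 0.988      # predicted not to register


def registration_score(reply_page_views):
    """Confidence that the student registers, usable for ranking students."""
    label, confidence = jrip_rule(reply_page_views)
    return confidence if label == 1 else 1.0 - confidence


# However many views a student has, only two distinct scores can occur.
distinct_scores = {round(registration_score(views), 3) for views in range(20)}
```

With only two distinct scores, the ROC curve has a single interior point, so the ranking-based AUC cannot reward any finer ordering of students, whereas the three leaves of the J-48 tree yield three distinct scores.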


5.2 Performance for Demographic Groups 
We then tested both of the cross-validated prediction models across
three sets of demographic comparisons: gender (male vs.
female), race (white vs. African-American) and whether the
student’s parents attended college or not. The J-48 model
performed relatively well across all the demographic groups
tested, at levels close to those obtained on the full data set.
Its performance for each demographic group is listed in
Table 2 below. The model performed marginally worse for
African-American students (Kappa = 0.728, AUC = 0.905) than
on the full data set (Kappa = 0.806, AUC = 0.925), but its
performance was still quite good in absolute terms even for
this group.


Table 2. Performance of J-48 models of student enrollment 
for different demographic groups 


Group                            Kappa   AUC
Female                           0.833   0.894
Male                             0.753   0.946
African-American                 0.728   0.905
White                            0.826   0.932
Parents attended college         0.763   0.908
Parents did not attend college   0.829   0.933


Similarly, we found that our J-Rip model performed at
comparable levels across different demographic groups
when compared to performance on the full data set. As with the
J-48 model, the J-Rip model was least accurate for
African-American students, but still obtained good predictions, with
Kappa = 0.748, AUC = 0.907.


Table 3. Performance of J-Rip models of student enrollment 
for different demographic groups 


Group                            Kappa   AUC
Female                           0.833   0.937
Male                             0.811   0.875
African-American                 0.748   0.907
White                            0.751   0.906
Parents attended college         0.774   0.896
Parents did not attend college   0.854   0.921


These findings suggest that the models obtained here are reliable 
across demographic groups, indicating that they can be used 
without concern regarding equity in their predictions. 


6. DISCUSSION 


To increase access to higher education for non-traditional 
students, institutions of higher learning have increasingly 
embraced online learning platforms to provide greater flexibility 
for working adults looking to return to school. Despite easier 
access, student retention and attrition have remained an important
issue that online orientation courses like Engage aim to address. 


In our study on students taking the orientation course Engage, 
we generated a total of 139 features based on student actions 
within the Blackboard course platform and developed models to 
predict future student registration in a program-specific for-credit
course within the state of Arkansas’s online eVersity. The
features selected by our model were able to predict with high 
confidence levels the likelihood that students would register in a 
program-specific course after the orientation course. It is also 
notable that both the J-48 and J-Rip models selected the same 
feature (total number of views of Discussion Board Reply page) 
to be positively associated with future course registration. This 
finding echoes and provides support for earlier research 
suggesting that student participation in discussion boards is 
associated with better retention and achievement [18, 23]. 


The features selected in both our models, while not surprising,
have important implications that can help administrators
and facilitators design interventions that better identify at-risk
students who may not continue on after the orientation
course. For instance, the feature of discussion board reply views
appeared to have a very strong association with future 
registration in an eVersity course. According to previous 
research, students’ interactions within a course help improve 
student retention rates [14, 23]. Students who accessed the 
Discussion Board Reply page more often are more likely to be 
interacting with other students and course facilitators. In this 
manner, these students may experience greater engagement in 
the course and the eVersity program, which in turn could
explain the association between the students’ usage of the 
discussion board and future course registration within eVersity. 


Within the J-48 model, three other features were selected in 
addition to discussion board reply views. The total number of 
views of the Messages page was also included in some models 
during cross-validation, even though it was not included in the
final decision tree built on the entire data set. Like the
Discussion Board Reply page views feature, this feature 
suggests that students who have more interactions with other 
students and course facilitators are more likely to register in 
another eVersity course afterwards. 


Features capturing how soon and how often students checked
their course grades also appear to be associated
with future course registration. In the decision tree
generated with the J-48 algorithm, students who only viewed their
course grades after a long period of time had a high likelihood
of not registering for another eVersity course in the future. This
can be another useful indicator of students who may not be as 
engaged in the eVersity program and their achievement in the 
orientation course, and who have a lower likelihood of 
registering for another eVersity course. 


After developing our models, we tested their reliability across 
different demographic groups. We found that the models 
performed comparably well across students of different races and
gender, as well as between groups of students with parents who 
attended or did not attend college. These findings suggest that 
our model is not overtly biased towards or against a specific 
demographic group. 


Based on our models’ performance and the features selected, 
course administrators and facilitators could make further 
improvements to Engage to increase student retention in the 
online eVersity program. Since some of the selected features 
involve student interactions, course facilitators could try to 
embed more interactive activities within Engage to encourage 
students to reach out to their peers as well as to the program 
facilitators, and participate more actively in eVersity’s social 
community. Given that discussion board views had high 
predictive power for future course registration within eVersity, 
Engage course facilitators could encourage student participation 
in discussion boards early on in the course, and maintain a 
stronger presence within discussion boards to provide a more 
robust and consistent form of support for students embarking on 
the eVersity program. Nevertheless, it is worth noting that 
student participation in discussion boards may also be a proxy 
for student interest in the course content or their overall goal of 
studying within eVersity. Actions taken by course facilitators to 
encourage student participation in discussion boards may not be 
as helpful in increasing student engagement or interest in the 
course content. Alternatively, it may be more effective for 
course facilitators to tweak the discussion board activities to 
ensure that they are optimally interesting and relevant to the 
learners participating in the orientation course. 


7. CONCLUSION 


In this study, we made use of student interaction data from a 
credit-bearing online orientation course, Engage, in a completely
online university, to build a prediction model of student 
registration in future program-specific courses. The prediction 
models were developed using machine learning algorithms and 
tested across different demographic groups. Two algorithms 
were tested; the performance of both models was high, and the 
models provide indicators that predict future student registration 
in program-specific courses within the online eVersity program. 
These prediction models thus provide eVersity administrators 
and course facilitators with fine-grained information on student 
behavior within the orientation course that could improve 
student retention on eVersity. As such, further improvements
could be made to the orientation course Engage to accurately
target students at risk of dropping out of the online eVersity 
program, and provide further support to these students at an 
earlier stage in their higher education journey. 


8. ACKNOWLEDGEMENTS 

We would like to thank the Bill and Melinda Gates Foundation 
and the DLRN network for their support for our work. In 
addition, we would like to thank George Siemens, Candace 
Thille, Carolyn Rose, Carol Lashman, and Anita Crawley, for 
their helpful suggestions. 


9. REFERENCES 


[1] Arnold, K.E. et al. 2012. Course signals at Purdue:
    Using learning analytics to increase student success.
    2nd International Conference on Learning Analytics
    and Knowledge. May (2012), 2-5.

[2] Boston, W.E. et al. 2011. Comprehensive Assessment
    of Student Retention in Online Learning Environments.
    School of Arts and Humanities, APUS. Paper 1 (2011).

[3] Brewer, S.A. and Yucedag-Ozcan, A. 2012.
    Educational persistence: Self-efficacy and topics in a
    college orientation course. Journal of College Student
    Retention: Research, Theory and Practice. 14, 4
    (2012), 451-465.

[4] Carr, S. 2000. As distance education comes of age, the
    challenge is keeping the students. Chronicle of Higher
    Education. 46, 23 (2000).

[5] Carruth, A.K., Broussard, P.C., Waldmeier, V.P.,
    Gauthier, D.M. and Mixon, G. 2014. Graduate Nursing
    Online Orientation Course: Transitioning for Success.
    Journal of Nursing Education. 49, March (2014), 14-17.

[6] Cohen, W.W. 1995. Fast effective rule induction.
    Twelfth International Conference on Machine
    Learning. (1995), 115-123.

[7] Dekker, G.W. et al. 2009. Predicting students drop out:
    A case study. EDM’09 - Educational Data Mining
    2009: 2nd International Conference on Educational
    Data Mining. (2009), 41-50.

[8] Derby, D.C. and Smith, T. 2004. An orientation course
    and community college retention. Community College
    Journal of Research and Practice. 28, 9 (2004), 763-773.

[9] Fike, D.S. and Fike, R. 2008. Predictors of First-Year
    Student Retention in the Community College.
    Community College Review. 36, 2 (2008), 68-88.

[10] Fritz, J. 2011. Classroom walls that talk: Using online
    course activity data of successful students to raise
    self-awareness of underperforming peers. Internet and
    Higher Education. 14, 2 (2011), 89-97.

[11] Hoskins, S.L. and van Hooff, J.C. 2005. Motivation
    and ability: Which students use online learning and
    what influence does it have on their achievement?
    Communications. 36, 2 (2005).

[12] Jayaprakash, S.M. et al. 2014. Early alert of
    academically at-risk students: An open source analytics
    initiative. Journal of Learning Analytics. 1, 1 (2014),
    6-47.

[13] Jones, K.R. 2013. Developing and implementing a
    mandatory online student orientation. Journal of
    Asynchronous Learning Networks. 17, 1 (2013), 43-45.

[14] Jung, I. et al. 2010. Effects of different types of
    interaction on learning achievement, satisfaction and
    participation in web-based instruction. Innovations in
    Education and Teaching International. 39, 2 (2010),
    153-162.

[15] Lauria, E.J.M. et al. 2012. Mining academic data to
    improve college student retention: An open source
    perspective. Proceedings of the Second International
    Conference on Learning Analytics And Knowledge -
    LAK ’12. May (2012), 139-142.

[16] Lee, Y. and Choi, J. 2011. A review of online course
    dropout research: Implications for practice and future
    research. Educational Technology Research and
    Development. 59, 5 (2011), 593-618.

[17] Milliron, M.D. et al. 2014. Insight and action analytics:
    Three case studies to consider. Research and Practice
    in Assessment. 9, (2014), 70-89.

[18] O’Brien, B. and Renner, A.L. 2002. Online student
    retention: Can it be done? World Conference on
    Educational Multimedia, Hypermedia and
    Telecommunications (2002).

[19] Ocumpaugh, J. et al. 2014. Population validity for
    Educational Data Mining models: A case study in affect
    detection. British Journal of Educational Technology.
    45, 3 (2014), 487-501.

[20] Park, J.-H. and Choi, H.J. 2009. Factors Influencing
    Adult Learners’ Decision to Drop Out or Persist in
    Online Learning. Educational Technology & Society.
    12, 4 (2009), 207-217.

[21] Parker, A. 1999. A study of variables that predict
    dropout from distance education. International Journal
    of Educational Technology. 1, 2 (1999), 1-10.

[22] Pascarella, E.T. and Terenzini, P.T. 2005. How college
    affects students: A third decade of research. How
    College Affects Students: A Third Decade of Research.

[23] Roberts, J. and Styron, R. 2010. Student satisfaction
    and persistence: factors vital to student retention.
    Research in Higher Education Journal. 6, 3 (2010), 1-18.

[24] Tyler-Smith, K. 2006. Early Attrition among first time
    eLearners: A review of factors that contribute to
    drop-out, withdrawal and non-completion rates of adult
    learners undertaking eLearning programmes. Journal of
    Online Learning and Teaching. 2, 2 (2006), 73-85.

[25] Willging, P.A. and Johnson, S.D. 2009. Factors that
    influence students’ decision to dropout of online
    courses. Journal of Asynchronous Learning Network.
    13, 3 (2009), 115-127.




